Proteome-wide analysis reveals G protein-coupled receptor-like proteins in rice (Oryza sativa)

ABSTRACT G protein-coupled receptors (GPCRs) constitute the largest family of transmembrane proteins in metazoans that mediate the regulation of various physiological responses to discrete ligands through heterotrimeric G protein subunits. The existence of GPCRs in plant is contentious, but their comparable crucial role in various signaling pathways necessitates the identification of novel remote GPCR-like proteins that essentially interact with the plant G protein α subunit and facilitate the transduction of various stimuli. In this study, we identified three putative GPCR-like proteins (OsGPCRLPs) (LOC_Os06g09930.1, LOC_Os04g36630.1, and LOC_Os01g54784.1) in the rice proteome using a stringent bioinformatics workflow. The identified OsGPCRLPs exhibited a canonical GPCR ‘type I’ 7TM topology, patterns, and biologically significant sites for membrane anchorage and desensitization. Cluster-based interactome mapping revealed that the identified proteins interact with the G protein α subunit which is a characteristic feature of GPCRs. Computational results showing the interaction of identified GPCR-like proteins with G protein α subunit and its further validation by the membrane yeast-two-hybrid assay strongly suggest the presence of GPCR-like 7TM proteins in the rice proteome. The absence of a regulator of G protein signaling (RGS) box in the C- terminal domain, and the presence of signature motifs of canonical GPCR in the identified OsGPCRLPs strongly suggest that the rice proteome contains GPCR-like proteins that might be involved in signal transduction.


Introduction
The ability to sense the environment and respond appropriately to it is crucial for the survival of the organisms.2][3][4][5] The classical paradigm of GPCRmediated signaling initiates with the binding of ligands to their cognate GPCRs and induces conformational transformations in the receptor.Ligand-bound transitional GPCRs function as guanine nucleotide exchange factors (GEFs) for the GDP-bound Gα subunit promoting the exchange of GDP for GTP and culminating in the dissociation of activated Gα (Gα-GTP) and free Gβγ dimers. 6,7Activated Gα-GTP and free Gβγ dimers independently induce diverse downstream effector protein cascades including enzymes (adenylyl cyclases, Protein kinases etc.), ion selective channels (Ca + -K + channel, Na + -H + exchange), effector enzymes (phospholipase C and D) and secondary messengers (diacyl glycerol and inositol 1,4,5-trisphosphate). 4,[8][9][10][11] Post signal relay, deactivation of the G protein signaling pathway is mediated by the intrinsic GTPase activity of Gα-GTP to reconstitute the Gαβγ heterotrimer.The rate of Gα-GTP hydrolysis is G protein subfamilyspecific.8][19] The heterotrimeric G proteinmediated signal transduction pathway is transversely conserved in metazoans, and readily apparent in plants.
The human proteome is reported to harbor >850 GPCRs possessing homo/heterodimerization properties and has a rich repertoire of G proteins comprising 23 Gα, 5 Gβ, and 12 Gγ subunits [20][21][22] leading to over 1,300 theoretical heterotrimeric complexes.This large repertoire creates extensive signaling pathways to fulfill the response to diverse environmental stimuli.In sharp contrast, the number of known heterotrimeric signaling complex components in plants is dramatically lower.4][25] In plants, G protein functions are well-defined in fungal defense, [26][27][28] seed germination, 29,30 stomatal opening and closing [31][32][33] oxidative stress, 34,35 and sugar precipitation. 29,30,36However, no receptor for signal transduction from the environment to the cell has been reported.Whole-genome sequencing revealed that heterotrimeric G protein signaling and its complexes are highly complex.8][39][40] Studies of the few plant GPCRs characterized to date have provided evidence that plants use similar signaling components to regulate G protein-mediated signaling, although the signal inputs are different. 9-41-44he significantly reduced sequence homology (<25%) among GPCRs within a single GPCR family as well as across distinct GPCR families, impedes the identification of GPCRs in other life forms. 45Hence, the identification of novel GPCRs based on their two-dimensional topology, which classically consists of an extracellular amino terminus, seven membrane-spanning domains connected by three ECLs, and three ICLs and an intracellularly located C-terminal tail possessing the ability to couple with the Gα subunit of the heterotrimeric G protein, seems more appropriate.Recent advancements in whole-genome sequencing and comprehensive bioinformatics approaches have provided an alternative to expedite the mining of large genomes or proteomes for novel GPCRs.
The two most promising software programs for proteomewide mining of seven helical transmembrane proteins are: the quasi-periodic feature classifier (QFC) algorithm, 46 which is used for the identification of GPCRs and other multitransmembrane proteins from genome databases using statistically characterized local and global structural feature-based 'feature space' to determine the quasi-periodic physicochemical properties of multi-transmembrane proteins.7TMRmine 47 is the other powerful tool for filtering the whole proteome via alignment-free and alignment-based classifiers.It also helps to study the diversity and phylogenetic relationships between GPCRs.The signature functional features of GPCRs have been established by earlier studies.The topology of a protein is an important feature, and various bioinformatics platforms can be used to predict the topology of any membrane protein.HMMTOP, 48 Phobius, 49 and TMpred 50 are significant tools used for topology prediction.Improved transmembrane protein topology prediction tools with deep transfer learning reduce the dependence on evolutionary information. 51n the present study, we utilized pre-established experimental information and respective databases for the screening of putative GPCRs from the rice proteome.The experimental workflow was designed to address all known functional characteristics of GPCRs, starting from proteome-wide screening followed by topological analysis, functional analysis, biologically significant site prediction, motif scanning, and protein network analysis, to understand the fate of the proteins.The identified high-ranked GPCR-like proteins were tested for their interaction with the rice Gα subunit.Our study provides strong evidence for the presence of at least three 7TM noncanonical OsGPCRLPs in rice proteome that interact with rice Gα heterotrimeric subunit.

Proteome-wide computational mining of 7TM receptor(s)-like proteins
The proteome of O. sativa subsp.indica was scanned for hierarchical identification of heptahelical transmembrane receptor (7TMR) proteins on the 7TMRmine integrated web server (http://bioinfolab.unl.edu/emlab/7tmr/), 47which provides flexibility to integrate alignment-based and alignmentfree protein classifiers in any combination and hierarchical order (Figure 1).The alignment-based protein classification method relies on a homology search of query 7TMRs with wellcharacterized protein families and is less effective at identifying unrelated or novel 7TMRs.However, alignment-free classifiers are based on multivariate statistical methods that identify remote similarities and divergent 7TMRs with greater sensitivity. 47We used a combinatorial approach comprising alignment-based (Phobius, HMMTOP, and GPCRHMM) and alignment-free (SVM-AA, SVM-DI, PLS-ACC, KNN15, KNN20) classifiers to predict the most potential 7TMR candidates in the rice proteome.All the 'positive' filtered loci at each level were combined through the 'AND' approach and further subjected to topology analyses.

Functional annotation of high-ranked putative OsGPCRLPs
The PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) workbench was used for functional annotation of the high-rank candidate OsGPCRLPs on FFPRED3, a platform for the prediction of gene ontology annotations. 52The amino acid sequences were used as the input data for FFPRED3.The output data included biological and molecular function and cellular component predictions.

Superfamily, family, and biologically significant site prediction in high-ranked OsGPCRLPs
The protein superfamily and family of high-ranked putative OsGPCRLPs were determined via Interpro (https://www.ebi.ac.uk/interpro/) and SUPERFAMILY 2.0 (http://supfam.org).InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites using predictive models available on the InterPro consortium. 53SUPERFAMILY is an HMM-based database for structural and functional annotation of proteins and genomes. 54Biologically significant sites were analyzed on PROSITE (https://prosite.expasy.org/), a protein domain and family database to identify significant sites, patterns, and profiles. 55,56

Interactome mapping of high-ranked candidate OsGPCRLPs
The potential interacting partners of all high-ranked putative OsGPCRLPs were mapped on STRING v.11.5 57 with the O. sativa dataset.STRING is a database for the prediction of protein interaction networks based on established and predicted physical (direct) as well as functional (indirect) protein-protein interactions derived mainly from five data sources: genomic context predictions, high-throughput laboratory experiments, conserved co-expression, automated text-mining, and interactions aggregated from other primary databases.Interacting partners were mapped at a medium confidence score of 0.4, and there were 10 and 30, maximum interactions in 1 st and 2 nd shells, respectively.The interaction network obtained for the query proteins was then subjected to k-means clustering by selecting '3' clusters.

Coupling specificity prediction for high-ranked OsGPCRLPs
The coupling specificity of high-ranked putative OsGPCRLPs with four families of Gα proteins was predicted using the PRED-COUPLE 2.0 program (http://athina.biol.uoa.gr/bioinformatics/PRED-COUPLE2/). 58,59A safe cutoff of 0.3 was applied to discriminate between positive and negative predictions.Only loci with cutoff values above 0.3 were considered to have positive predictions.

GPCR-like proteins prediction in high-ranked OsGPCRLPs
Prediction of GPCR-like proteins for the high-ranked OsGPCRLPs was done using BLAST search performed for 'GPCRs only' on gpDB (http://bioinformatics.biol.uoa.gr/gpDB), a relational repository of known GPCRs, G proteins, and effectors classified into respective classes, families and subfamilies. 60The BLAST output lists the sequences in the database having significant E-values ranked by statistical significance.

Prediction of 'RGS box' architecture in high-ranked OsGPCRLPs
Multiple sequence alignment of the cytoplasmic C-terminal domain of high-ranked OsGPCRLPs was performed with human RGS19 (hRGS19; SwissProt accession no.P49795), Arabidopsis RGS1 (AtRGS1; GenPept NP_189238) and known plant GPCRs from Arabidopsis (AtGCR1; accession no.AAN15633.1, 38 B6GV49, 61 and pea (PsGPCR; NCBI protein id AAY30370.2; 43n Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo) under default settings.This alignment was performed to evaluate the presence of RGS boxes in high-ranked OsGPCRLPs and/or their close relatedness with known plant GPCRs.The C-terminal domains of high-ranked OsGPCRLPs and other plant GPCRs were generated by TMHMM 2.0, while the sequences used for AtRGS1 and hRGS19 were chosen according to Chen et al. 36

In vivo coupling validation of computational protein-protein predictions in high-ranked OsGPCRLPs
High-ranked candidate OsGPCRLPs were further validated through a membrane-based split-ubiquitin yeast-2-hybrid (MYTH) system 62 to establish their protein-protein interaction with the rice heterotrimeric G protein α subunit (RGA1).MYTH is a split-ubiquitin-based Y2H system specialized for membrane proteins that exploit the inherent ability of the two termini of the wild-type split ubiquitin molecule (NubI and Cub) to undergo spontaneous assembly followed by subsequent Ubiquitin binding protein (UBP)-catalyzed cleavage of transcription factor fused downstream of Cub.A single amino acid substitution of Ile with Gly at position 13 attenuates this spontaneity, which is then dictated by the interactions between the 'bait' and 'prey' proteins fused with Cub and Nub, respectively.Interactions between the bait and prey proteins facilitate the reconstitution of the split-ubiquitin heterodimer, which upon recognition by UBP, is cleaved at its C-terminus to release the transcription factor LexA/VP16.This cleaved transcription factor then translocates into the nucleus, where the DNA-binding domain of LexA binds to its operator upstream of reporter genes (in this case, His, Ade2, and lacZ), promoting VP16-mediated transcriptional activation of reporter genes.

Construction of plasmids
All the bait proteins (OsGPCRLP930, OsGPCRLP784, and OsGPCRLP630) were cloned and inserted into pTMBV-α (bait vector) at the two 'SfiI' restriction sites, ensuring inframe insertion with the downstream 'Cub-LexA/VP16' tag to encode the C-terminally tagged bait 'OsGPCRLP-Cub-LexA /VP16'.The stop codons of all bait genes were deleted to ensure the synthesis of the Cub-LexA/VP16 tag.Similarly, the prey protein RGA1 was cloned and inserted into pPR3Nα (prey vector) at the SfiI restriction sites such that the fusion protein NubG-RGA1, which is encoded by the prey vector, contains NubG at its N-terminus (Figure 2).The control prey vectors pOstNubI and pOstNubG, which encode a fusion of yeast ERlocalized membrane protein oligosaccharyl transferase to wildtype and modified 'Nub' yeast ubiquitin, respectively, were used as positive and negative controls. 63Cloning was performed using standard cloning procedures, and all the clones were confirmed through restriction profiling and sequencing.The sequences of primers used for cloning bait and prey proteins are listed in Table 1.

Expression of the plasmid constructs
The functional expression of the OsGPCRLPs and bait-prey interactions were established in the reporter strain NMY51 as described by Iyer et al. 64 with modifications.Yeast transformation was performed following the LiOAc/ ssDNA/PEG protocol. 65The bait and prey vector combinations (~1.5 µg each) used for cotransformation are listed in Table 2. Before total protein extraction, the yeast cells were treated with LiAc and NaOH. 66The isolated total protein was resolved through SDS-PAGE (10% resolving and 5% stacking), and the gel was stained with 0.25% Coomassie Brilliant Blue R-250.Furthermore, western blot analysis was performed to verify bait expression following the standard protocol.Primary anti-LexA antibody (1:1500) and  alkaline phosphatase-conjugated secondary anti-rabbit IgG (1:10000) were used.The functionality of the bait was validated via bait and pOstNubI (TI) coexpression.The probability of bait self-activation was assayed through the coexpression of bait with the negative control prey vector, pOstNubG (TG).Nontransformed yeast cells were used as experimental negative controls.Transformants were selected and screened for positive interactions on auxotrophic selection plates of increasing stringency (2D→3D→4D).To test the proper activation of reporter genes, 4D selection plates were supplemented with 0.5 mM 3-AT.Furthermore, the interaction strength for each of the three bait-prey combinations was assessed using a filter liftoff assay involving the activation of the lacZ reporter gene and color-based detection of β-galactosidase activity.

Proteome-wide mining identified 7TMR(s) in rice proteome
On alignment-based classifiers Phobius and HMMTOP identified all proteins as 7TMR positives, whereas, GPCRHMM classified 25 proteins as 7TMR positives, while remaining as negatives (Figure 3).The alignment-free classifiers viz.SVM-AA classified 2579 as 7TMR and 63,738 as non-7TMR proteins, SVM-DI as 2688 positives and 63,629 negatives, PLS-ACC as 5680 positives and 60,637 negatives, KNN15 as 6766 positives and 59,551 negatives, and KNN20 as 6848 positives and 59,469 negative 7TMRs (Figure 3).The combinatorial approach, including both alignment methods, identified 25 7TMR positive proteins that may be GPCRLP candidates.These candidates were considered for further analyses, and their loci, length and predicted signal peptides are listed in Table 3.

Functional annotation of high-rank OsGPCRLPs
All high-ranked OsGPCRLPs were shown to be involved in cell communication and signal transduction in response to various stimuli and are associated with cell surface receptor signaling pathways.Notably, the entire candidate OsGPCRLPs were identified as having biological and molecular functions in GPCR signaling pathways.However, the OsGPCRLP630 did not show GPCR activity unlike the other two candidates.All the candidate GPCRLPs were identified as intrinsic components of the plasma membrane.Unlike OsGPCRLP930 and OsGPCRLP630, the candidate OsGPCRLP784 was not reported to be a component of the nuclear outer membrane-ER membrane network.All the candidates exhibited significant probabilities and high (H) SVM reliability for the biological, molecular, and cellular component features (Table 5).and G protein-coupled receptor family 2 profile 2 (PS50261: PROSITE profiles).Similarly, the OsGPCRLP784 was categorized as a member of the THH1/TOM1/TOM3 family (IPR040226) with TOBAMOVIRUS MULTIPLICATION PROTEIN 1-LIKE ISOFORM X1 (PTHR31142, PANTHER) signature motif from amino acids 26-340.Additionally, amino acids 28-135 and 182-321 of putative OsGPCRLP784 represent the THH1/TOM1/TOM3 protein domain (InterPro: IPR009457 and Pfam: PF06454).However, the gene product of OsGPCRLP630 belonged to uncharacterized conserved protein UCP031277 (IPR016971) family with the UCP031277 (PIRSF031277; PIRSF) signature motif from amino acids 1-317.SUPERFAMILY 2.0 classified OsGPCRLP930 and OsGPCRLP784 under the Family A GPCR-like superfamily and Rhodopsin-like family with significant E-values, whereas no result was obtained for OsGPCRLP630 (Table 6).

Candidate OsGPCRLPs belonged to GPCR family possessing biologically significant sites
The ExPASy PROSITE database of protein families and domains identified various patterns and biologically significant sites in the OsGPCRLPs (Figure 4).In OsGPCRLP930, six N-myristoylation sites (

Interactome mapping of high-ranked candidate OsGPCRLPs revealed interactions with G protein-mediated signaling network partners
STRING analyses mapped 41, 31, and 34 interacting proteins for the OsGPCRLP930, OsGPCRLP784, and OsGPCRLP630 query proteins, respectively (Supplementary Tables S1-3).Proteins that are directly linked to GPCR functions are shown in Figure 5.
Interactome mapping generated a large number of interacting partners for each of the query proteins clustered in three distinct clusters, with colors indicating no functional priority mentioned in the supplementary tables .Proteins sharing similar types of interactions with the query protein are usually grouped into a single cluster.For OsGPCRLP930, 16 proteins were grouped in red and green, whereas 9 proteins were grouped in the blue cluster (Supplementary Table S1).The OsGPCRLP930 interacted with the G protein components viz.RGA1, RGB1, RGG1, RGG2, and the putative XLG of the red cluster (Figure 5(a)).However, the OsGPCRLP784 exhibited interactions with 9, 18, and 4 proteins grouped into red, green, and blue clusters, respectively (Supplementary Table S2).OsGPCRLP784 also interacted with G protein components similar to OsGPCRLP930, but not with XLG (Figure 5(b)).The interaction network of the OsGPCRLP630 contained 31 proteins in red, 1 protein in green, and 2 proteins in the blue cluster (Supplementary Table S3).The OsGPCRLP630 also interacted with all the G protein subunits, except RGG2 and XLG, as mapped in the red cluster (Figure 5(c)).In addition to interacting with G protein subunit components, OsGPCRLP630 and OsGPCRLP784 interact with the GPCR-type G protein (GTG) COLD1.In addition, two uncharacterized protein kinasedomain-containing proteins (B8B954 and B8BDL3) were also mapped to OsGPCRLP630.

Coupling specificity predicted Gα class-specific interactions
The normalized posterior probability scores of OsGPCRLP930 were 0.93, 0.67 and 0.38 for Gα i/o , Gα q/11 and Gα s , respectively.Similarly, the coupling specificity scores of OsGPCRLP630 were 0.78, 0.98, and 0.67 for Gα i/o , Gα q/11 , and Gα s , respectively.On the other hand, OsGPCRLP784 showed significant coupling scores of 0.97 and 0.78 for Gα i/o and Gα q/11 , respectively.All three candidate OsGPCRLPs were predicted to have negative coupling scores for Gα 12/13 , while OsGPCRLP784 additionally had a negative score for Gα s (0.1) (Table 7).

High-ranked OsGPCRLPs showed similarity with canonical GPCRs
A BLAST search on gpDB identified GPCR-like proteins for each high-ranked OsGPCRLP with significant E-values.Only the top five hits for each OsGPCRLP are shown in Table 8.The top five GPCRLP identified for OsGPCRLP930 included cAMP

The RGS box architecture was absent in putative OsGPCRLPs
Multiple sequence alignment showed that OsGPCRLPs do not contain RGS boxes in their cytoplasmic C-terminus and do not share any similarity with the cytoplasmic C-terminal domains of Arabidopsis and human RGS proteins.However, OsGPCRLP930 showed significant sequence similarity with the Arabidopsis and lotus GCR1 proteins, while OsGPCRLP784 and OsGPCRLP630 shared few consensus residues with the plant GPCRs considered under study (Figure 6).

Putative OsGPCRLPs strongly coupled with RGA1
All three OsGPCRLPs were cloned as bait proteins and assayed for functional expression and interaction with RGA1.Yeast cells co-transformed with the bait plasmid pTMBVα (930TMBV, 630TMBV, and 784TMBV) and prey plasmid (pPR3Nα-RGA1) showed prolonged growth of pinhead white to creamish colonies on 2D plates after one-week of incubation for all the selected vector combinations, indicating high transformation efficiency.Yeast cells expressing the positive controls (930TI, 630TI, and 784TI) and experimental constructs (930TN, 630TN, and 784TN) were positively selected on interaction selection media as evidenced by the robust growth of colonies on 3D and 4D selection plates (Figure 7a).The growth of yeast cells co-expressing 930TI, 630TI and 784TI confirmed the accurate expression, desired conformation and localization of the bait proteins.The appearance of indigo-colored bands on the BCIP-treated PVDF membrane in the case of all positive and positive test controls (TI and TN) further validated the expression of the functional bait proteins (Figure 7c).Substantial growth observed for 930TN, 630TN, and 784TN, suggest that positive bait-prey interactions led to reporter gene activation, thereby enabling the growth of host cells.No growth was observed in the negative controls, indicating appropriate selection of experimental controls.Subsequent growth of the selected 4D colonies on 4D +0.5 mM 3-AT supplemented media verified and eliminated the possibility of false positives.Furthermore, the filter-lift assay performed to assess the strength of the bait-prey interaction using X-gal, developed a blue color for the test (930TN, 784TN, and 630TN) and positive control samples (930TI, 784TI, and 630TI) only (Figure 7b).

Discussion
Plants have evolved sophisticated mechanisms to survive adverse conditions owing to their sessile habit.Membrane-localized heptahelical GPCRs and guanine nucleotide-mediated signal transduction are paramount mechanisms that enable plants to mitigate the temporal stresses and sustainably grow and develop.
Metazoans have an extensive repertoire of GPCR:G protein components that plants lack.As GPCR is a vital signaling component in various metazoans, the search for GPCR analogs in plants is of primary significance and was accelerated after the characterization of GCR1 in Arabidopsis. 44Since then, several genome studies have identified potential GPCR-like proteins in plants. 9-41-69rystallographic structures and MD simulation studies revealed significant similarities between plant Gαs, particularly AtGPA1, and the metazoan Gα i family, with notable differences in the flexibility of the helical domain as well as inter-and intradomain interactions.The observed relaxation and dynamic mobility of the helical domain in comparison to the respective Ras-like domain endorse plant Gα with a GEF-independent nucleotide-exchange ability.This spontaneity causes GTP hydrolysis rather than GTPbinding, the rate-limiting step of the G protein cycle in plants. 70hylogenetic studies advocate early eukaryotic origin and independent evolution of G proteins across eukaryotic lineages.This finding explains the differential receptor-G protein signaling architecture between opisthokonts and plants.While G proteins underwent greater expansion in metazoans, as evidenced by the multiplicity of heterotrimers, sequence divergence led to the origination of certain atypical Gαs in plants along with a finite set of canonical Gα, Gβ, and Gγ subunits.Presumably, these signaling elements enable plants to achieve the functional complexity integral to G protein signaling.Regardless, the functional dynamism of plant GPCR-G protein signaling is intriguingly comparable to that of animal systems.Maruta et al. 70 also proposed the existence of a ternary complex, wherein membrane-bound RGS and heterotrimers are associated with single transmembrane receptor kinases rather than 7TM receptors, suggesting multisite phosphorylationmediated co-regulation of the RGS protein and Gα in plants.
Other nonconventional aspects of the heterotrimeric G protein cycle, as speculated by specific plant phenotypes, include invariable independence of XLGs, nucleotide exchange-independent canonical Gα functions, and conditional heterotrimer dissociation.Thus, in the case of plants, the interactome of the heterotrimeric complex is not restricted to 7TM receptors and RGS proteins but also encompasses distinct non-7TM receptors and effectors that seemingly regulate the G protein switch through an intricate posttranslational code. 71In contrast, another compendium of research substantiates the occurrence of a subset of 7TM receptors in addition to RGS proteins in plants.However, the plausible variations in the canonical theme might be due to the obvious differences in the biology of plants and animal systems.Despite the reviewed presence of archetype G protein-mediated signaling mechanisms, the existence of canonical GPCRs in plants is highly controversial. 25,44,67,72,73ince the first report of a plant GPCR in Arabidopsis (GCR1), there have been continuous attempts to validate the presence of GPCR-type 7TM receptors in the plant kingdom.Additionally, AtGCR1 is the first plant GPCR that has been validated for interaction with GPA1. 44However, the intrinsic GEF activity of AtGCR1 has yet to be elucidated.The computational study by Gookin et al. 67 provides a pioneering approach to screening plant genomes for the identification of 7TM proteins with the typical architecture of a canonical GPCR using various bioinformatics platforms and prediction tools.Their study highlights the use of a bioinformatics pipeline exclusive to GPCR-mining in plants with sequenced genomes, and in vivo studies illustrated definite coupling between highly probable candidate GPCRs and AtGPA1.The sequence conservation of GPCRs is intrinsically as low as <25%, which depreciate the use of sequence homologybased identification of novel GPCRs. 45,74,75nvariably, GPCRs constitute the largest and most diverse family of type-I receptor proteins and have seven membranespanning helices with extracellular (ECL1-3) and intracellular loop (ICL1-3) regions.The laterally located N-terminus is free to interact with the diverse ligands, and the cytoplasm-located C-terminus interacts with the G protein heterotrimer (Gα, Gβ, and Gγ subunits) complex in a coupling-specificity-driven manner.The 3D spatial organization of intracellular regions along with the long C-terminal tail is highly dynamic and dictates the functional state of the receptor.Thus, topology prediction seems logical.
In the present study, we followed bioinformatics approaches to critically filter 7TM proteins from the rice proteome based on a topology prediction approach in the initial step in the quest for novel GPCR-like proteins using the 7TMRmine tool in combination with alignment-free and alignment-based algorithms with negligible disparities. 47The alignment-based methods recall proteins with significant similarity to known 7TMR subfamilies but fail to identify divergent 7TMRs.Such outliers can be sensitively identified using alignment-free methods.The combination of alignment-based and alignment-free methods facilitates the identification of a broader spectrum of 7TMR candidate proteins by complementing the limitations of each other.This integrated approach generated a pool of 25 proteins with putative 7TM topology (Table 3).Two-dimensional architecture and number of transmembrane domains on more stringent topology prediction platforms, viz.TMHMM2.0,SACSMEMSET, and TOPCONS sorted only three loci (LOC_Os06g09930.1,LOC_Os04g36630.1,and LOC_Os01g54784.1),out of 25 proteins, three have a typical GPCR-type 'type-I' heptahelical transmembrane topology, which in this study has been designated high-ranking candidate GPCR-like proteins.The idea behind the use of multiple tools for topological analysis based on complementary methods was to ensure the fidelity of the resulting data since the majority of such tools run on predictive models.
The overall results of superfamily, family and biologically significant sites prediction, as discussed in section 3.1.4,identified OsGPCRLP930 and OsGPCRLP784 as members of the established GPCR family based on conserved motifs.As per InterPro prediction, OsGPCRLP930 share similarity with the GCR1-cAMP receptor family (IPR022343: InterPro family) with the GCR1-cAMP receptor (PR02001: PRINTS) conserved motif and GPCR GCR1 family (IPR022340: InterPro family) with GCR1 (PR02000: PRINTS) signature motifs, while OsGPCRLP784 is a member of the THH1/TOM1/TOM3 family, which is also a part of the GPCR superfamily (Table 6). 76,77SUPERFAMILY2.0shows relatedness of both OsGPCRLP930 and OsGPCRLP784 with the established Family A GPCR superfamily and the Rhodopsin-like family.
OsGPCRLP630 belongs to an uncharacterized conserved protein family with currently no experimental data on the group or their homologs.Additionally, this group does not exhibit features indicative of any function (Interpro) and SUPERFAMILY 2.0 could not predict any family for this candidate (Table 6).All the posttranslational sites as enumerated by the PROSITE are typical of GPCRs and facilitate proper folding and localization.The presence of phosphorylation sites such as cAMP and cGMP-dependent phosphorylation sites, PKC, and CK II, confirms the occurrence of phosphorylation-mediated GPCR regulatory mechanisms. 78,79he presence of multiple protein kinase C phosphorylation sites and casein kinase II phosphorylation sites in OsGPCRLP930 and OsGPCRLP784 (Figure 4) seems to play key roles in the regulation of different cellular processes by inactivating GPCRs through feedback regulation by secondary messenger-activated kinases. 80Phosphorylation-mediated alterations in receptor conformation impair G protein heterotrimeric complex interactions.Such receptor regulation generally mediates 'heterologous' or nonagonist-specific deactivation.It occurs because stimuli that elevate cAMP (or diacylglycerol in the case of PKC) have the potential to phosphorylate and desensitize any GPCR containing a suitable PKC and/or PKA consensus phosphorylation site. 81The presence of casein kinase II phosphorylation sites reinforces the unidirectional desensitization of OsGPCRLPs after relaying the stimulus to downstream G protein-mediated signal transduction, as casein kinase II has been shown to phosphorylate substrates in hierarchical reactions. 82The presence of multiple N-myristoylation sites invariably aids in membrane anchoring, 83 and multispanning of OsGPCRLPs might be involved in the differentiation of extracellular stimuli.
N-glycosylation, in general, facilitates the folding and trafficking of membrane proteins and is critical for GPCR-type specific functions. 84For example, the functions of the receptors rhodopsin, 85 histamine H2, 86 and angiotensin II type-1 87 do not imply glycosylation.However, N-glycosylation reportedly has adaptable functions in the biosynthesis of β2 adrenergic receptors, 88 the modulation of ligand binding affinity in vasopressin V1a receptors, 89 and the regulation of G i -mediated signaling for P 2 Y 12 receptors. 90In addition to affecting receptor functions, glycosylation also affects receptor expression levels of some GPCRs, including rhodopsin, β2 adrenergic, angiotensin II type-1, and orphan Gpr176. 84The OsGPCRLP930 has two N-glycosylation sites, whereas OsGPCRLP784 and OsGPCRLP630 lack glycosylation sites.Presence of a leucine zipper motif which is required for dimerization, in the OsGPCRLP630 suggests its possible role in homo or heterodimerization. 91n addition to the 7TM scaffold, highly conserved consensus motifs and signature salt and disulfide bridges dispersed across various transmembrane helices and intracellular regions that precisely create GPCR-specific folds are other features of its structure.Fold recognition studies on AtGCR1 using class A, B, and F GPCRs as templates revealed that AtGCR1 has a GPCR fold that, more than any other observation (lack of GEF activity), verifies its status as a plant GPCR. 72The presence of definitive GPCR-fold specific residues and motifs in our highranked OsGPCRLPs strengthens their potential as true GPCR manifolds.Given that, RGS-encoding genes have been lost from the grass family owing to the Thr→Asn substitution in Gα in most monocots, 92 the sequence alignment of putative high-ranked OSGPCRLPs with known human and Arabidopsis RGS proteins revealed no significant similarities (Figure 6).Thus, the absence of the C-terminal RGS box  domain in the proposed putative OsGPCRLPs indicate that these 7TMR proteins are not members of the 7TM-RGS protein family.Moreover, the same sequence alignment highlights the significant similarity in OsGPCRLP930, which strongly suggests that rice proteome has GPCRLPs similar to Arabidopsis GCR1 and lotus GPCRs. 44,61Multiple sequence alignment of the OsGPCRLP630 and OsGPCRLP784 annotated them as RGS domain-lacking 7TM proteins.Thus, all this rationalized evidence strongly indicates that the identified high-ranked OsGPCRLPs might be relatives and new members of the limited plant GPCR repertoire.Pertinent evidence suggests that the identified OsGPCRLPs might be the members of noncanonical GPCR families.Analyses of their probable molecular and biological functions and subcellular localization and gene ontology annotations indicated their active role in cell communication and potential association with cell surface receptor signaling pathways.Notably, the OsGPCRLP930 and OsGPCRLP784 were identified as integral components of GPCR signaling pathways.However, GPCR signaling-oriented functions were not directly predicted in the case of OsGPCRLP630 (Table 6).
Moreover, a 7TM protein may function as a canonical GPCR if it possesses the intrinsic ability to induce conformation-driven nucleotide exchange in the Gα subunit of the heterotrimeric G protein complex.In pursuit of this feature in our high-ranked GPCRLPs, our analysis of the STRING database showed that all three OsGPCRLPs interacted with all typical GPCR-interacting proteins (Figure 5).An interactome map of OsGPCRLP930 revealed that it interacts with a putative XLG (STRING: OsJ_19863) along with the classical G protein subunits D1, RGB1, RGG1 and RGG2 (Figure 5a).Additionally, the OsGPCRLP784 and OsGPCRLP630 were shown to interact with typical heterotrimeric components with the additional interacting partner GPCR-type G protein COLD1 (Figure 5b,c, respectively).Interestingly, COLD1 is a plasma membrane and endoplasmic reticulum-localized regulator of G protein signaling (RGS) that bestows low temperature tolerance in rice 42,69,93 and was reported to be a functional substitute for the classical RGS proteins. 94However, the interactome map of the OsGPCRLP630 was variable at medium confidence, and RGG2 was not mapped in the interaction network.Moreover, two unknown kinase domain- The interaction of Gα with GPCRs is not random but is specified by the dynamic spatial organization of the intracellular domains of GPCRs, which are known to have intrinsic GEF activity.The coupling specificity of GPCR and Gα thus provides selectivity for downstream effectors and signaling routes to be activated.Plant Gα subunits are the closest homologs of the G i -type Gα family.Generally, GPCRs function in a Gα-dependent manner; that is, a given GPCR is more likely to interact with and activate a single subtype of the G protein family.Nonetheless, this specificity is not uniform, as numerous GPCRs have been shown to couple with more than one Gα-type in animals. 95The absence of sequence homology and consensus sequence motifs for G protein coupling in GPCRs associated with the same G protein subtype suitably supports the fact that, upon binding of a ligand, the 3D conformation switching by the three intracellular loops along with the C-terminal tail of GPCRs governs G protein selectivity.These transitions, when propagated to the flexible switch regions in Gα, facilitate the  nucleotide exchange-triggered dissociation of the Gβγ heterodimer. 4Studies on GPCR:G protein selectivity suggest that a large subset of nonolfactory GPCRs exhibit selectivity for G i/o -type Gα (43%), followed by 33% of G q/11 -type and 25% of G s -coupled GPCRs. 95We observed a similar tendency of Gα coupling specificity for the OsGPCRLP930 and OsGPCRLP784, with the highest coupling specificity for G i/o -type Gα, followed by G q/11 .However, unlike that of OsGPCRLP930, the coupling specificity of OsGPCRLP784 with G s is negligible.Moreover, OsGPCRLP630 exhibited the highest Gα-selectivity for G q/11 -type (Table The search for animal orthologs of candidate OsGPCRLPs also strongly supported that the identified OsGPCRLPs are strongly related to well-known animal GPCR families (Table 8).
Affirmative indications from the computational analysis were strongly supported by the in vivo coupling assay between high-ranked OsGPCRLPs and RGA1 using the MYTH assay exclusively modified for membrane proteins.The prolific growth of yeast cells coexpressing the OsGPCRLPs and RGA1 under stringent levels of auxotrophic selection, 4D -SD agar supplemented with 0.5 mM 3-AT, strongly indicates a positive and strong interaction between the OsGPCRLPs expressed in the native conformation and native RGA1 (Figure 7).
In conclusion, rigorous bioinformatics analysis and strong positive interactions between the identified subset of OsGPCRLPs and RGA1 suggest that the rice genome encodes a subset of 7TM receptor-like proteins that are structurally and functionally at par with canonical GPCRs.It also provides promising information regarding the extension of the GPCRmediated signaling repertoire in plants.However, by observing the existing evidences and self-activation paradox of Gα, the possible intrinsic GEF activities of these novel rice GPCRLPs and their roles in the biochemical mechanism of the RGA1 activation/deactivation cycle can be examined.

Figure 2 .
Figure 2. Schematic line diagram representing the construct design for MYTH assay.

Figure 4 .
Figure 4. Biologically significant sites in high-ranked OsGPCRLPs analyzed on PROSITE.

Figure 5a .
Figure 5a.Interactome mapping for high-ranked OsGPCRLPs on String v.11.5.5a representing the interactome map of OsGPCRLP930 for 40 proteins.5b representing the interactome map for OsGPCRLP784 for 30 proteins.5c representing the interactome map for OsGPCRLP630 for 33 proteins.Each map includes one query protein.The K-means clustering function is distributed for all proteins in three clusters (Red, Green, Blue) (Supplementary data).

Figure 7 .
Figure 7.In vivo membrane-yeast-two-hybrid (MYTH) assay for OsGPCRLPs (630, 784 and 930) with RGA1. A. NMY51 yeast strain non-transformed negative control (TG), positive control (TI) and test (TN), plated on 4-dropout selection media supplemented with 0.5 mM 3-AT.B. Filter lift assay of plate a showing strength of interaction between high-ranked OsGPCRLP baits and RGA1 as prey.C. Western blot assay of plate a showing the specificity of the bait proteins.
lotus (LjGCR1; UniProtKB/TrEMBL Flowchart detailing the computational analysis of candidate GPCRLPs in rice.

Table 1 .
Primers used for cloning of OsGPCRLPs in MYTH vectors.

Table 2 .
Vector combinations used in transformation of reporter strain NMY51.
Figure 3. Proteome-wide mining for heptahelical transmembrane GPCRLPs on 7TMR mine.branetopology on all three topology prediction tools under study and were classified as high-ranking candidate GPCRLPs.Similarly, 7 loci were categorized as medium-ranked candidate GPCRLPs that met the above criteria according to at least

Table 3 .
Proteome-wide identification of candidate GPCRLPs in rice on 7TMRmine server.
OUT = N-terminus is extracellularly located; IN = N-terminus is intracellularly located; Shaded loci possess typical GPCR-type 'type-I' heptahelical transmembrane topology.A trace amine receptor_TaR-9 of Rattus norvegicus, and human olfactory receptors (6V1, 52E6, and 4A4 of class A of odorant receptor family in pairwise alignment on gpDB.

Table 6 .
Prediction of superfamily and family of candidate OsGPCRLPs.

Table 7 .
G protein coupling specificity predication on PRED-COUPLE 2.0 tool.

Table 8 .
BLAST output of high rank OsGPCRLPs with gpDB database.BLAST search output showing only top five GPCR-like proteins identified at gpDB with significant E-values ranked by statistical significance for each high-ranked OsGPCRLP.